Managing large datasets with iRODS - a performance analyses
Abstract
The integrated Rule-Oriented Data System (iRODS) [3] is a Grid data management system that organizes distributed data and their metadata. A Rule Engine allows data storage, data access and data processing to be defined flexibly. This paper presents scenarios implemented in a benchmark tool that measures the performance of an iRODS environment, together with results of measurements on large datasets. The scenarios concentrate on data transfers, metadata transfers and stress tests. Users can adjust the scenarios to fit their own use case. The results show that the tool can be used to find bottlenecks and to reveal potential for optimizing the settings of an iRODS environment.
Similar resources
Towards Implementing Virtual Data Infrastructures - a Case Study with iRODS
Scientists demand easy-to-use, scalable and flexible infrastructures for sharing, managing and processing their data spread over multiple resources accessible via different technologies and interfaces. In our previous work, we developed the conceptual framework VISPA for addressing these requirements. This paper provides a case study assessing the integrated Rule-Oriented Data System (iRODS) fo...
Event Processing in Policy Oriented Data Grids
Data Grids are used for managing massive amounts of data (Peta scale) that are distributed across heterogeneous storage systems. As such they are complex in nature and deal with multiple operations in the life-cycle of a data set from creation to usage to preservation to final disposition. Administering a data grid can be very challenging (not only for system administrators, but also for data p...
A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets
Background and Purpose: Nowadays, breast cancer is reported as one of the most common cancers amongst women. Early detection of the cancer type is essential to aid in informing subsequent treatments. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large datasets and are not developed for small datasets. Although the large datasets might lead ...
A High-Performance Database System for Managing Large Multi-resolution Medical Images
In this work we address the design of a database system to explore, process, and visualize very large (multiterabyte) multi-resolution image datasets, obtained from MRI, CT and ultrasound, and digitized microscopy images. The basic requirements for such a database management system include (1) support for adding and managing user-defined processing functions, (2) managing datasets stored in dis...
Proceedings Front Cover 8X10
The integrated Rule Oriented Data Management System (iRODS) is a policy-driven data management system that is starting to be used by projects with large data volume requirements that require a highly available system. In this paper we describe an approach to provide a High Availability load-balanced iRODS System (HAIRS). We also describe the advantages and disadvantages of the approach and fu...
Publication date: 2010